Search results for "Decision tree learning"

showing 10 items of 13 documents

A Simple Method to Predict Blood-Brain Barrier Permeability of Drug-Like Compounds Using Classification Trees

2017

Background: To know the ability of a compound to penetrate the blood-brain barrier (BBB) is a challenging task; despite the numerous efforts realized to predict/measure BBB passage, they still have several drawbacks. Methods: The prediction of the permeability through the BBB is carried out using classification trees. A large data set of 497 compounds (recently published) is selected to develop the tree model. Results: The best model shows an accuracy higher than 87.6% for training set; the model was also validated using 10-fold cross-validation procedure and through a test set achieving accuracy values of 86.1% and 87.9%, correspondingly. We give a brief explanation, in structural terms, o…

0301 basic medicine; Quantitative structure–activity relationship; Computer science; Datasets as Topic; Quantitative Structure-Activity Relationship; computer.software_genre; 01 natural sciences; Permeability; 03 medical and health sciences; Molecular descriptor; Drug Discovery; International literature; Computer Simulation; Training set; Decision tree learning; Decision Trees; 0104 chemical sciences; 010404 medicinal & biomolecular chemistry; 030104 developmental biology; Pharmaceutical Preparations; Blood-Brain Barrier; Test set; Data mining; Blood brain barrier permeability; computer; Algorithms; Decision tree model; Medicinal Chemistry
researchProduct

Prediction of Chromatin Accessibility in Gene-Regulatory Regions from Transcriptomics Data

2017

Abstract: The epigenetics landscape of cells plays a key role in the establishment of cell-type specific gene expression programs characteristic of different cellular phenotypes. Different experimental procedures have been developed to obtain insights into the accessible chromatin landscape including DNase-seq, FAIRE-seq and ATAC-seq. However, current downstream computational tools fail to reliably determine regulatory region accessibility from the analysis of these experimental data. In particular, currently available peak calling algorithms are very sensitive to their parameter settings and show highly heterogeneous results, which hampers a trustworthy identification of accessible chromatin…

0301 basic medicine; Science; Computational biology; Regulatory Sequences Nucleic Acid; Biology; computer.software_genre; Article; Epigenesis Genetic; 03 medical and health sciences; Databases Genetic; Humans; Epigenetics; Computational model; Deoxyribonucleases; Multidisciplinary; Sequence Analysis RNA; Gene Expression Profiling; Decision tree learning; Q; R; Sequence Analysis DNA; Chromatin; Chromatin; Gene expression profiling; Identification (information); 030104 developmental biology; Gene Expression Regulation; Medicine; Data mining; Precision and recall; Peak calling; computer; Algorithms; Scientific reports
researchProduct

Improving the Competency of Classifiers through Data Generation

2001

This paper describes a hybrid approach in which sub-symbolic neural networks and symbolic machine learning algorithms are grouped into an ensemble of classifiers. Initially, each classifier determines which portion of the data it is most competent in. The competency information is used to generate new data that are used for further training and prediction. The application of this approach in a difficult-to-learn domain shows an increase in the predictive power, in terms of the accuracy and level of competency of both the ensemble and the component classifiers.

Artificial neural network; business.industry; Computer science; Test data generation; Decision tree learning; Disjunctive normal form; computer.software_genre; Machine learning; Domain (software engineering); ComputingMethodologies_PATTERNRECOGNITION; Problem domain; Component (UML); Classifier (linguistics); Data mining; Artificial intelligence; business; computer
researchProduct

Computational identification of chemical compounds with potential anti-Chagas activity using a classification tree

2021

Chagas disease is endemic to 21 Latin American countries and is a great public health problem in that region. Current chemotherapy remains unsatisfactory; consequently the need to search for new drugs persists. Here we present a new approach to identify novel compounds with potential anti-chagasic action. A large dataset of 584 compounds, obtained from the Drugs for Neglected Diseases initiative, was selected to develop the computational model. Dragon software was used to calculate the molecular descriptors and WEKA software to obtain the classification tree. The best model shows accuracy greater than 93.4% for the training set; the tree was also validated using a 10-fold cross-validation p…

Chagas disease; Computer science; Trypanosoma cruzi; Antiprotozoal Agents; Quantitative Structure-Activity Relationship; Bioengineering; Ligands; Machine learning; computer.software_genre; 01 natural sciences; Constant false alarm rate; Software; Molecular descriptor; Drug Discovery; Chagas Disease; classification tree; Virtual screening; Molecular Structure; 010405 organic chemistry; business.industry; Decision tree learning; General Medicine; virtual screening; 0104 chemical sciences; 010404 medicinal & biomolecular chemistry; Identification (information); Tree (data structure); Anti-chagasic action; Test set; Molecular Medicine; Artificial intelligence; business; computer; Software
researchProduct

The predictive power of game-related statistics for the final result under the rule changes introduced in the men’s world water polo championship: a …

2019

The objectives of this study were (i) to compare water polo game-related statistics by match outcome (winning and losing teams) after the application of the new rules, and (ii) to develop a classif...

Computer science; Decision tree learning; sports; 05 social sciences; Physical Therapy Sports Therapy and Rehabilitation; 030229 sport sciences; Water polo; Outcome (game theory); 050105 experimental psychology; 03 medical and health sciences; 0302 clinical medicine; Statistics; Predictive power; 0501 psychology and cognitive sciences; Orthopedics and Sports Medicine; Championship; sports.sports_position; International Journal of Performance Analysis in Sport
researchProduct

Land cover classification of VHR airborne images for citrus grove identification

2011

Abstract Managing land resources using remote sensing techniques is becoming a common practice. However, data analysis procedures should satisfy the high accuracy levels demanded by users (public or private companies and governments) in order to be extensively used. This paper presents a multi-stage classification scheme to update the citrus Geographical Information System (GIS) of the Comunidad Valenciana region (Spain). Spain is the first citrus fruit producer in Europe and the fourth in the world. In particular, citrus fruits represent 67% of the agricultural production in this region, with a total production of 4.24 million tons (campaign 2006–2007). The citrus GIS inventory, created in…

Engineering; Geographic information system; Database; business.industry; Decision tree learning; Cadastre; Feature extraction; Decision tree; Land cover; computer.software_genre; Atomic and Molecular Physics and Optics; Computer Science Applications; Support vector machine; Identification (information); Computers in Earth Sciences; business; Engineering (miscellaneous); computer; Cartography; ISPRS Journal of Photogrammetry and Remote Sensing
researchProduct

Multifactorial combinations predicting active vs inactive stages of change for physical activity in adolescents considering built environment and psy…

2018

Gerontology; Male; Health (social science); Adolescent; Cross-sectional study; Geography Planning and Development; Health Behavior; Physical activity; MEDLINE; Level design; 03 medical and health sciences; Social support; 0302 clinical medicine; Surveys and Questionnaires; Humans; 030212 general & internal medicine; Built Environment; Exercise; Built environment; Decision tree learning; Public Health Environmental and Occupational Health; Social Support; 030229 sport sciences; Cross-Sectional Studies; Spain; Environment Design; Female; Psychology; Psychosocial; Healthplace
researchProduct

Estimating feature discriminant power in decision tree classifiers

1995

Feature Selection is an important phase in pattern recognition system design. Even though there are well established algorithms that are generally applicable, the requirement of using certain type of criteria for some practical problems makes most of the resulting methods highly inefficient. In this work, a method is proposed to rank a given set of features in the particular case of Decision Tree classifiers, using the same information generated while constructing the tree. The preliminary results obtained with both synthetic and real data confirm that the performance is comparable to that of sequential methods with much less computation.

Incremental decision tree; Computer science; business.industry; Decision tree learning; Rank (computer programming); Decision tree; Pattern recognition; Feature selection; Machine learning; computer.software_genre; Set (abstract data type); Tree (data structure); Feature (machine learning); Artificial intelligence; business; computer
researchProduct

Deterministic Linkage as a Preceding Filter for Other Record Linkage Methods

2015

Deterministic record linkage (RL) is frequently regarded as a rival to more sophisticated strategies like probabilistic RL. We investigate the effect of combining deterministic linkage with other linkage techniques. For this task, we use a simple deterministic linkage strategy as a preceding filter: a data pair is classified as 'match' if all values of attributes considered agree exactly, otherwise as 'nonmatch'. This strategy is separately combined with two probabilistic RL methods based on the Fellegi–Sunter model and with two classification tree methods (CART and Bagging). An empirical comparison was conducted on two real data sets. We used four different partitions into training data a…

Linkage (software); education.field_of_study; Computer science; Decision tree learning; Population; Probabilistic logic; computer.software_genre; Filter (higher-order function); Expectation–maximization algorithm; Computer Science (miscellaneous); Data mining; education; computer; Algorithm; Record linkage; Test data; International Journal of Information Technology & Decision Making
researchProduct
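
The exact-agreement rule quoted in the abstract above can be sketched in a few lines. This is only an illustration, not the authors' implementation: the function names, the dictionary record layout, and the choice to hand every 'nonmatch' pair on to a later linkage method are assumptions, since the truncated abstract does not say how the filter and the later method are combined.

```python
from typing import Mapping, Sequence, Tuple, List

Record = Mapping[str, str]
Pair = Tuple[Record, Record]


def deterministic_match(rec_a: Record, rec_b: Record, attributes: Sequence[str]) -> str:
    """Classify a pair as 'match' only if every considered attribute agrees exactly."""
    agree = all(rec_a[attr] == rec_b[attr] for attr in attributes)
    return "match" if agree else "nonmatch"


def prefilter(pairs: Sequence[Pair], attributes: Sequence[str]) -> Tuple[List[Pair], List[Pair]]:
    """Split candidate pairs into exact matches and the remainder.

    Assumption: the remainder would be passed to a second method
    (probabilistic or tree-based); that step is not shown here.
    """
    matched: List[Pair] = []
    remaining: List[Pair] = []
    for rec_a, rec_b in pairs:
        if deterministic_match(rec_a, rec_b, attributes) == "match":
            matched.append((rec_a, rec_b))
        else:
            remaining.append((rec_a, rec_b))
    return matched, remaining
```

With hypothetical attributes such as `["first", "last", "dob"]`, a single differing character in any field is enough to send the pair to `remaining`, which is why such a filter is cheap but strict.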

Comparing Boosting and Bagging for Decision Trees of Rankings

2021

Abstract: Decision tree learning is among the most popular and most traditional families of machine learning algorithms. While these techniques excel in being quite intuitive and interpretable, they also suffer from instability: small perturbations in the training data may result in big changes in the predictions. The so-called ensemble methods combine the output of multiple trees, which makes the decision more reliable and stable. They have been primarily applied to numeric prediction problems and to classification tasks. In the last years, some attempts to extend the ensemble methods to ordinal data can be found in the literature, but no concrete methodology has been provided for preference…

Ordinal data; Boosting (machine learning); Preference learning; Ensemble methods; Computer science; business.industry; Decision tree learning; Decision trees; Decision tree; Library and Information Sciences; Machine learning; computer.software_genre; Ensemble learning; Boosting; Mathematics (miscellaneous); Ranking; Pattern recognition (psychology); Psychology (miscellaneous); Artificial intelligence; Preference learning; Statistics Probability and Uncertainty; business; computer; Rankings
researchProduct
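
The idea of combining the output of multiple trees, mentioned in the last abstract above, can be illustrated with a toy bagging sketch for plain binary classification. It is not the ranking method of that paper: the one-split "stump" learner, the 1-D data, and all names (`fit_stump`, `bagged_stumps`) are assumptions chosen to keep the example self-contained.

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Fit a one-split tree on 1-D data: choose the threshold and leaf labels with fewest errors."""
    best = None
    for t in sorted(set(xs)):
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= t else right) != y for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right


def bagged_stumps(xs, ys, n_trees=25, seed=0):
    """Train stumps on bootstrap resamples and predict by majority vote."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_trees):
        # Bootstrap: draw n indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))

    def predict(x):
        # An odd n_trees avoids ties for binary labels.
        return Counter(model(x) for model in models).most_common(1)[0][0]

    return predict
```

Each stump sees a slightly different resample, so individual thresholds vary; the vote averages that variation out, which is the stabilising effect the abstract attributes to ensemble methods.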